What is pg-query-stream?
The pg-query-stream npm package allows you to stream PostgreSQL query results using Node.js. This is particularly useful for handling large datasets that you don't want to load entirely into memory. It leverages the PostgreSQL client for Node.js (pg) to provide a readable stream interface for query results.
What are pg-query-stream's main functionalities?
Streaming Query Results
This feature allows you to stream the results of a PostgreSQL query. The code sample demonstrates how to set up a client, create a query stream, and pipe the results to the standard output in JSON format.
const { Client } = require('pg');
const QueryStream = require('pg-query-stream');
const JSONStream = require('JSONStream');

async function streamQuery() {
  const client = new Client({
    user: 'your_user',
    host: 'your_host',
    database: 'your_database',
    password: 'your_password',
    port: 5432,
  });
  await client.connect();

  const query = new QueryStream('SELECT * FROM your_table');
  const stream = client.query(query);
  stream.pipe(JSONStream.stringify()).pipe(process.stdout);
  stream.on('end', () => {
    client.end();
  });
}

streamQuery();
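Because the query stream is a regular object-mode readable stream, it can also be consumed with `for await` instead of `pipe`. The following is a minimal sketch rather than part of the package: `forEachRow` and `demoForEachRow` are hypothetical names, and `Readable.from` stands in for the stream returned by `client.query(new QueryStream(...))` so the example runs without a database.

```javascript
const { Readable } = require('node:stream');

// Hypothetical helper (not part of pg-query-stream): calls `handleRow` for
// each row in turn and resolves with the number of rows seen.
// `rowStream` is assumed to be any object-mode Readable.
async function forEachRow(rowStream, handleRow) {
  let count = 0;
  // Readable streams are async iterables; the loop body does not receive
  // the next row until the awaited handler has finished.
  for await (const row of rowStream) {
    await handleRow(row);
    count += 1;
  }
  return count;
}

// Stand-in for a query stream so the sketch runs without a database.
async function demoForEachRow() {
  const fakeRows = Readable.from([{ num: 0 }, { num: 1 }, { num: 2 }]);
  let total = 0;
  const count = await forEachRow(fakeRows, (row) => {
    total += row.num;
  });
  return { count, total };
}
```

If the stream emits an error, the `for await` loop throws, so wrapping the call in `try`/`finally` gives one place to call `client.end()`.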
Handling Large Datasets
This feature is useful for handling large datasets by streaming query results directly to a file. The code sample shows how to stream the results of a large table query to an output file.
const { Client } = require('pg');
const QueryStream = require('pg-query-stream');
const JSONStream = require('JSONStream');
const fs = require('fs');

async function streamToFile() {
  const client = new Client({
    user: 'your_user',
    host: 'your_host',
    database: 'your_database',
    password: 'your_password',
    port: 5432,
  });
  await client.connect();

  const query = new QueryStream('SELECT * FROM large_table');
  const stream = client.query(query);
  const fileStream = fs.createWriteStream('output.json');
  // The query stream emits row objects, while a file stream accepts only
  // strings or Buffers, so serialize the rows to JSON text first.
  stream.pipe(JSONStream.stringify()).pipe(fileStream);
  stream.on('end', () => {
    client.end();
  });
}

streamToFile();
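One caveat with `pipe` is that it does not forward errors between streams, so a failure partway through can leave the client open. A possible alternative, sketched here under assumptions, is `pipeline` from Node's `stream/promises` module with a small transform that writes one JSON document per line. The names `toNdjson`, `writeRows`, and `demoWriteRows` are hypothetical, and the in-memory source and sink stand in for a real query stream and file stream.

```javascript
const { Readable, Transform, Writable } = require('node:stream');
const { pipeline } = require('node:stream/promises');

// Hypothetical helper: turns each row object into one line of JSON text.
function toNdjson() {
  return new Transform({
    writableObjectMode: true, // accepts row objects, emits strings
    transform(row, _encoding, callback) {
      callback(null, JSON.stringify(row) + '\n');
    },
  });
}

// Hypothetical helper: streams rows into `destination` and resolves once
// everything is flushed. It rejects if any stage fails.
async function writeRows(rowStream, destination) {
  await pipeline(rowStream, toNdjson(), destination);
}

// In-memory stand-ins so the sketch runs without a database or a file.
async function demoWriteRows() {
  const chunks = [];
  const sink = new Writable({
    write(chunk, _encoding, callback) {
      chunks.push(chunk.toString());
      callback();
    },
  });
  await writeRows(Readable.from([{ num: 0 }, { num: 1 }]), sink);
  return chunks.join('');
}
```

With a real connection, `destination` could be `fs.createWriteStream('output.ndjson')`, and the `await writeRows(...)` call could sit inside `try`/`finally` so that `client.end()` runs on both success and failure.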
Other packages similar to pg-query-stream
pg-cursor
The pg-cursor package provides a way to use PostgreSQL cursors in Node.js. It allows you to fetch rows in batches, which can be useful for processing large datasets without loading them entirely into memory. Unlike pg-query-stream, which provides a readable stream interface, pg-cursor gives you more control over the fetching process by allowing you to specify the number of rows to fetch at a time.
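To make the batch-oriented style concrete, here is a rough sketch of a read loop. It assumes only that the cursor exposes the callback form `read(rowCount, callback)` and that an empty batch signals the end of the result set. `readBatch`, `eachBatch`, and `fakeCursor` are hypothetical names, and the fake cursor is an in-memory stand-in so the example runs without a database.

```javascript
// Wraps the callback-style read(rowCount, callback) in a promise.
function readBatch(cursor, batchSize) {
  return new Promise((resolve, reject) => {
    cursor.read(batchSize, (err, rows) => (err ? reject(err) : resolve(rows)));
  });
}

// Hypothetical helper: fetches `batchSize` rows at a time and hands each
// batch to `onBatch`. Resolves with the total number of rows processed.
async function eachBatch(cursor, batchSize, onBatch) {
  let total = 0;
  for (;;) {
    const rows = await readBatch(cursor, batchSize);
    if (rows.length === 0) break; // an empty batch means the cursor is exhausted
    await onBatch(rows);
    total += rows.length;
  }
  return total;
}

// In-memory stand-in that mimics read(rowCount, callback).
function fakeCursor(allRows) {
  let offset = 0;
  return {
    read(rowCount, callback) {
      const rows = allRows.slice(offset, offset + rowCount);
      offset += rows.length;
      setImmediate(callback, null, rows);
    },
  };
}
```

With five rows and a batch size of two, `onBatch` would be called with batches of two, two, and one row. The caller decides when to ask for the next batch, which is the extra control the paragraph above refers to.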
pg-promise
The pg-promise package is a PostgreSQL interface for Node.js that supports promises. It provides a wide range of features, including streaming query results. While pg-query-stream focuses specifically on streaming, pg-promise offers a more comprehensive set of features for interacting with PostgreSQL databases, including query building, transactions, and connection management.
pg-query-stream
Receive result rows from pg as a readable (object) stream.
installation
$ npm install pg --save
$ npm install pg-query-stream --save
requires pg>=2.8.1
use
const pg = require('pg')
const pool = new pg.Pool()
const QueryStream = require('pg-query-stream')
const JSONStream = require('JSONStream')

pool.connect((err, client, done) => {
  if (err) throw err
  const query = new QueryStream('SELECT * FROM generate_series(0, $1) num', [1000000])
  const stream = client.query(query)
  stream.on('end', done)
  stream.pipe(JSONStream.stringify()).pipe(process.stdout)
})
The stream uses a cursor on the server so it efficiently keeps only a low number of rows in memory.
This is especially useful when doing ETL on a huge table. Using manual limit and offset queries to fake out async iteration through your data is cumbersome, and way way way slower than using a cursor.
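A back-of-the-envelope model hints at where the slowdown comes from: rows skipped by an offset still have to be computed by the server, so every page repeats the work of all the pages before it. The sketch below is a deliberately simplified cost model, not a benchmark. It ignores indexes, caching, and network round trips, and the function names are hypothetical.

```javascript
// Rows the server has to produce when paging with limit/offset: each page
// recomputes and discards `offset` rows before returning its own rows.
function rowsTouchedWithOffset(totalRows, pageSize) {
  let touched = 0;
  for (let offset = 0; offset < totalRows; offset += pageSize) {
    touched += offset + Math.min(pageSize, totalRows - offset);
  }
  return touched;
}

// A cursor walks the result set once, whatever the batch size.
function rowsTouchedWithCursor(totalRows) {
  return totalRows;
}
```

Under this model, reading 1,000,000 rows in pages of 1,000 touches 500,500,000 rows with limit/offset versus 1,000,000 with a cursor, and the gap widens as the table grows.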
note: this module only works with the JavaScript client, and does not work with the native bindings. libpq doesn't expose the protocol at a level where a cursor can be manipulated directly.
contribution
I'm very open to contribution! Open a pull request with your code or idea and we'll talk about it. If it's not way insane we'll merge it in too: isn't open source awesome?
license
The MIT License (MIT)
Copyright (c) 2013-2020 Brian M. Carlson
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.